    Efficient AUC Optimization for Information Ranking Applications

    Adequate evaluation of an information retrieval system to estimate future performance is a crucial task. Area under the ROC curve (AUC) is widely used to evaluate the generalization of a retrieval system. However, the objective function optimized in many retrieval systems is the error rate and not the AUC value. This paper provides an efficient and effective non-linear approach to optimize AUC using additive regression trees, with a special emphasis on the use of multi-class AUC (MAUC) because multiple relevance levels are widely used in many ranking applications. Compared to a conventional linear approach, the performance of the non-linear approach is comparable on binary-relevance benchmark datasets and is better on multi-relevance benchmark datasets. Comment: 12 pages
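The gap between error rate and AUC mentioned in this abstract can be illustrated with a small sketch. It is not the paper's method: the function names, the 0.5 threshold, and the toy scores are assumptions chosen for illustration. AUC is computed here from its pairwise definition, as the fraction of (relevant, non-relevant) pairs that the scores order correctly.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def error_rate(scores, labels, threshold=0.5):
    """Fraction of examples misclassified when thresholding the scores."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)


# Hypothetical toy data: two relevant (1) and two non-relevant (0) documents.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = pairwise_auc(toy_scores, toy_labels)
toy_err = error_rate(toy_scores, toy_labels)
```

The two quantities measure different things: error rate depends on a fixed threshold, while AUC depends only on the ordering of the scores, which is why optimizing one does not necessarily optimize the other.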

    On Making Good Games - Using Player Virtue Ethics and Gameplay Design Patterns to Identify Generally Desirable Gameplay Features

    This paper uses a framework of player virtues to perform a theoretical exploration of what is required to make a game good. The choice of player virtues is based upon the view that games can be seen as implements, that implements are good if they support an intended use, and that the intended use of games is to support people in being good players. A collection of gameplay design patterns, identified through their relation to the virtues, is presented to provide specific starting points for considering design options for games that are good in this sense. 24 patterns are identified as supporting the virtues, including RISK/REWARD, DYNAMIC ALLIANCES, GAME MASTERS, and PLAYER DECIDED RESULTS, along with 7 that counter three or more virtues, including ANALYSIS PARALYSIS, EARLY ELIMINATION, and GRINDING. The paper concludes by identifying limitations of the approach and by showing how it can be applied under other views of which features are preferable in games.

    Learning what matters - Sampling interesting patterns

    In the field of exploratory data mining, local structure in data can be described by patterns and discovered by mining algorithms. Although many solutions have been proposed to address the redundancy problems in pattern mining, most of them either provide succinct pattern sets or take the interests of the user into account, but not both. Consequently, the analyst has to invest substantial effort in identifying those patterns that are relevant to her specific interests and goals. To address this problem, we propose a novel approach that combines pattern sampling with interactive data mining. In particular, we introduce the LetSIP algorithm, which builds upon recent advances in 1) weighted sampling in SAT and 2) learning to rank in interactive pattern mining. Specifically, it exploits user feedback to directly learn the parameters of the sampling distribution that represents the user's interests. We compare the performance of the proposed algorithm to the state-of-the-art in interactive pattern mining by emulating the interests of a user. The resulting system allows efficient and interleaved learning and sampling, and thus user-specific, anytime data exploration. Finally, LetSIP demonstrates favourable trade-offs concerning both quality-diversity and exploitation-exploration when compared to existing methods. Comment: PAKDD 2017, extended version

    Flexible constrained sampling with guarantees for pattern mining

    Pattern sampling has been proposed as a potential solution to the infamous pattern explosion. Instead of enumerating all patterns that satisfy the constraints, individual patterns are sampled proportional to a given quality measure. Several sampling algorithms have been proposed, but each of them has its limitations when it comes to 1) flexibility in terms of quality measures and constraints that can be used, and/or 2) guarantees with respect to sampling accuracy. We therefore present Flexics, the first flexible pattern sampler that supports a broad class of quality measures and constraints, while providing strong guarantees regarding sampling accuracy. To achieve this, we leverage the perspective on pattern mining as a constraint satisfaction problem and build upon the latest advances in sampling solutions in SAT as well as existing pattern mining algorithms. Furthermore, the proposed algorithm is applicable to a variety of pattern languages, which allows us to introduce and tackle the novel task of sampling sets of patterns. We introduce and empirically evaluate two variants of Flexics: 1) a generic variant that addresses the well-known itemset sampling task and the novel pattern set sampling task as well as a wide range of expressive constraints within these tasks, and 2) a specialized variant that exploits existing frequent itemset techniques to achieve substantial speed-ups. Experiments show that Flexics is both accurate and efficient, making it a useful tool for pattern-based data exploration. Comment: Accepted for publication in Data Mining & Knowledge Discovery journal (ECML/PKDD 2017 journal track)
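The target distribution described in this abstract, in which each pattern is drawn with probability proportional to a quality measure, can be sketched on a toy dataset. This is deliberately the naive baseline and not Flexics: it enumerates every itemset first, which is exactly the step a real sampler is designed to avoid. The function names, the use of absolute support as the quality measure, and the toy transactions are all assumptions made for illustration.

```python
import random
from itertools import combinations


def enumerate_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support.

    Brute force over every subset of items; only feasible for tiny data.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                result[candidate] = support
    return result


def sample_patterns(transactions, min_support, n, seed=0):
    """Draw n itemsets with probability proportional to their support."""
    patterns = enumerate_itemsets(transactions, min_support)
    keys = list(patterns)
    weights = [patterns[k] for k in keys]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return rng.choices(keys, weights=weights, k=n)


# Hypothetical toy transactions.
toy_transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
toy_patterns = enumerate_itemsets(toy_transactions, min_support=2)
toy_sample = sample_patterns(toy_transactions, min_support=2, n=5)
```

Here the minimum-support threshold plays the role of a constraint and the support plays the role of the quality measure; other choices of either would slot into the same structure.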

    An automatic critical care urine meter

    Nowadays patients admitted to critical care units have most of their physiological parameters measured automatically by sophisticated commercial monitoring devices. More often than not, these devices supervise whether the values of the parameters they measure lie within a pre-established range, and warn of deviations from this range by triggering alarms. The automation of measuring and supervising tasks not only relieves the healthcare staff of a considerable workload but also avoids human errors in these repetitive and monotonous tasks. Arguably, the most relevant physiological parameter that is still measured and supervised manually by critical care unit staff is urine output (UO). In this paper we present a patent-pending device that provides continuous and accurate measurements of a patient's UO. The device uses capacitive sensors to take continuous measurements of the height of the column of liquid accumulated in two chambers that make up a plastic container. The first chamber, into which the urine flows, has a small volume. Once it has been filled, it overflows into a second, bigger chamber. The first chamber provides accurate UO measures of patients whose UO has to be closely supervised, while the second one avoids the need for frequent interventions by the nursing staff to empty the container.

    Like trainer, like bot? Inheritance of bias in algorithmic content moderation

    The internet has become a central medium through which 'networked publics' express their opinions and engage in debate. Offensive comments and personal attacks can inhibit participation in these spaces. Automated content moderation aims to overcome this problem using machine learning classifiers trained on large corpora of texts manually annotated for offence. While such systems could help encourage more civil debate, they must navigate inherently normatively contestable boundaries, and are subject to the idiosyncratic norms of the human raters who provide the training data. An important objective for platforms implementing such measures might be to ensure that they are not unduly biased towards or against particular norms of offence. This paper provides some exploratory methods by which the normative biases of algorithmic content moderation systems can be measured, by way of a case study using an existing dataset of comments labelled for offence. We train classifiers on comments labelled by different demographic subsets (men and women) to understand how differences in conceptions of offence between these groups might affect the performance of the resulting models on various test sets. We conclude by discussing some of the ethical choices facing the implementers of algorithmic moderation systems, given various desired levels of diversity of viewpoints amongst discussion participants. Comment: 12 pages, 3 figures, 9th International Conference on Social Informatics (SocInfo 2017), Oxford, UK, 13--15 September 2017 (forthcoming in Springer Lecture Notes in Computer Science)

    Space-Time Structure of Loop Quantum Black Hole

    In this paper we have improved the semiclassical analysis of the loop quantum black hole (LQBH) in the conservative approach of constant polymeric parameter. In particular we have focused our attention on the space-time structure. We have introduced a very simple modification of the spherically symmetric Hamiltonian constraint in its holonomic version. The new quantum constraint reduces to the classical constraint when the polymeric parameter goes to zero. Using this modification we have obtained a large class of semiclassical solutions parametrized by a generic function of the polymeric parameter. We have found that only a particular choice of this function reproduces the black hole solution with the correct asymptotically flat limit. At r=0 the semiclassical metric is regular and the Kretschmann invariant has a maximum peaked at L-Planck. The radial position of the peak does not depend on the black hole mass or the polymeric parameter. The semiclassical solution is very similar to the Reissner-Nordstrom metric. We have constructed the Carter-Penrose diagrams explicitly, giving a causal description of the space-time and its maximal extension. The LQBH metric interpolates between two asymptotically flat regions, the r to infinity region and the r to 0 region. We have studied the thermodynamics of the semiclassical solution. The temperature, entropy and the evaporation process are regular and can be defined independently of the polymeric parameter. We have studied the particular metric obtained when the polymeric parameter goes to zero. This metric is regular at r=0 and has only one event horizon, at r = 2m. The Kretschmann invariant maximum depends only on L-Planck. The polymeric parameter does not play any role in the black hole singularity resolution. The thermodynamics is the same. Comment: 17 pages, 19 figures

    CICLAD: A Fast and Memory-efficient Closed Itemset Miner for Streams

    Mining association rules from data streams is a challenging task due to the (typically) limited resources available vs. the large size of the result. Frequent closed itemsets (FCI) enable an efficient first step, yet current FCI stream miners are not optimal in resource consumption, e.g. they store a large number of extra itemsets at an additional cost. In a search for a better storage-efficiency trade-off, we designed Ciclad, an intersection-based sliding-window FCI miner. Leveraging in-depth insights into FCI evolution, it combines minimal storage with quick access. Experimental results indicate that Ciclad's memory footprint is much lower, and its overall performance better, than those of competing methods. Comment: KDD2

    New perspectives on the ecology of tree structure and tree communities through terrestrial laser scanning

    Terrestrial laser scanning (TLS) opens up the possibility of describing the three-dimensional structures of trees in natural environments with unprecedented detail and accuracy. It is already being extensively applied to describe how ecosystem biomass and structure vary between sites, but can also facilitate major advances in developing and testing mechanistic theories of tree form and forest structure, thereby enabling us to understand why trees and forests have the biomass and three-dimensional structure they do. Here we focus on the ecological challenges and benefits of understanding tree form, and highlight some advances related to capturing and describing tree shape that are becoming possible with the advent of TLS. We present examples of ongoing work that applies, or could potentially apply, new TLS measurements to better understand the constraints on optimization of tree form. Theories of resource distribution networks, such as metabolic scaling theory, can be tested and further refined. TLS can also provide new approaches to the scaling of woody surface area and crown area, and thereby better quantify the metabolism of trees. Finally, we demonstrate how we can develop a more mechanistic understanding of the effects of avoidance of wind risk on tree form and maximum size. Over the next few years, TLS promises to deliver both major empirical and conceptual advances in the quantitative understanding of trees and tree-dominated ecosystems, leading to advances in understanding the ecology of why trees and ecosystems look and grow the way they do.

    Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

    Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by some other parameter bounds such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must obey also that second rule, therefore the second is redundant". We discuss several existing alternative definitions of redundancy between association rules and provide new characterizations and relationships among them. We show that the main alternatives we discuss actually correspond to just two variants, which differ in the treatment of full-confidence implications. For each of these two notions of redundancy, we provide a sound and complete deduction calculus, and we show how to construct complete bases (that is, axiomatizations) of absolutely minimum size in terms of the number of rules. Finally, we explore an approach to redundancy with respect to several association rules, and fully characterize its simplest case of two partial premises. Comment: LMCS accepted paper
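The two rule parameters named in this abstract can be made concrete with a short sketch. Confidence follows the definition given above (the empirical conditional probability of the consequent given the antecedent); support is taken here as the fraction of transactions containing an itemset, which is a common convention but an assumption with respect to this paper. The function names and toy transactions are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Empirical conditional probability of the consequent given the antecedent."""
    covered = [t for t in transactions if antecedent <= t]
    if not covered:
        raise ValueError("antecedent does not occur in any transaction")
    return sum(1 for t in covered if consequent <= t) / len(covered)


# Hypothetical toy dataset: each transaction is the set of items that are true.
toy_data = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
rule_support = support({"a", "b"}, toy_data)          # rule a -> b
rule_confidence = confidence({"a"}, {"b"}, toy_data)  # rule a -> b
```

A rule whose confidence equals 1 is a full-confidence implication, the case in which the abstract says the two notions of redundancy differ.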